UNCLASSIFIED 


AD  NUMBER 


AD911417 


NEW  LIMITATION  CHANGE 
TO 

Approved  for  public  release,  distribution 
unlimited 


FROM 

Distribution  authorized  to  U.S.  Gov't, 
agencies  only;  Test  and  Evaluation;  MAR 
1973.  Other  requests  shall  be  referred  to 
Army Electronics Command, Fort Monmouth, NJ
07703. 


AUTHORITY 


USAEC  ltr,  4  Mar  1974 


THIS  PAGE  IS  UNCLASSIFIED 




Research  and  Development  Technical  Report 
ECOM-0292-6


ANALYTIC  MATHEMATICAL 
MODELS  OF  TACTICAL 
MILITARY  COMMUNICATIONS 
CHANNELS 


QUARTERLY  REPORT 

MAY,  1973 


R. T. CHIEN
C. L. CHEN
S. TSAI


DISTRIBUTION  STATEMENT 


Distribution limited to U.S. Government agencies only;
Test and Evaluation; March 73. Other requests for
this document must be referred to Commanding General,
U.S. Army Electronics Command, ATTN: AMSEL-NL-R-2,
Fort Monmouth, New Jersey 07703


UNITED STATES ARMY ELECTRONICS COMMAND - FORT MONMOUTH, N.J.

Contract DAAB07-71-C-0292
Coordinated Science Laboratory
University of Illinois
Urbana, Illinois 61801


NOTICES 

Disclaimers 

The findings in this report are not to be construed as
an official Department of the Army position, unless so designated
by other authorized documents.

The  citation  of  trade  names  and  names  of  manufacturers 
in  this  report  is  not  to  be  construed  as  official  Government 
indorsement  or  approval  of  commercial  products  or  services 
referenced  herein. 

Disposition 

Destroy this report when it is no longer needed. Do
not return it to the originator.


TR  ECOM-0292-6 
May  1973 


Reports  Control  Symbol 
OSD-1366 


ANALYTIC  MATHEMATICAL  MODELS  OF  TACTICAL 
MILITARY  COMMUNICATIONS  CHANNELS 


SIXTH  QUARTERLY  PROGRESS  REPORT 
1  October  1972  -  31  December  1972 


Contract No. DAAB07-71-C-0292
DA  Project  No.  IS6.62701.A327.06.Q7 


DISTRIBUTION  STATEMENT 

Distribution limited to U.S. Government
agencies only; Test and Evaluation;
March 1973. Other requests for this
document must be referred to Commanding
General, U.S. Army Electronics Command,
ATTN: AMSEL-NL-R-2, Fort Monmouth, New
Jersey 07703


Prepared  by 

R. T. Chien
C. L. Chen
S. Tsai


Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801


For

U. S. ARMY ELECTRONICS COMMAND, FORT MONMOUTH, N. J.


ABSTRACT 


The  Viterbi  decoding  algorithm  yields  minimum  probability  of 
error  when  applied  to  a  memoryless  channel  provided  that  all  input  sequences 
are  equally  likely.  In  this  report,  the  algorithm  was  generalized  for 
application  to  channels  with  finite  memory  and  it  was  shown  that  the 
generalized  algorithm  is  also  maximum-likelihood  decoding.  It  was  also 
shown  that  the  generalized  Viterbi  algorithm  on  a  simple  memory  channel 
performs  better  than  the  original  Viterbi  algorithm  with  the  same  decoding 
complexity. 

The  M-state  Markov  model  was  reviewed  in  this  report.  The 
process  of  identifying  the  parameters  of  the  M-state  model  from  the 
coefficients $A_i$ and $A_i(n_j, n_{j+1})$ of the gap model was determined to be
more  complicated  than  was  anticipated.  As  an  alternative,  the  simple 
partitioned  Markov  model  was  examined  to  determine  the  effect  of  the 
second  order  statistics,  namely  the  interdependence  of  the  gaps,  on  the 
error  burst  distribution.  An  alternative  definition  of  the  burst  was 
adopted  to  speed  up  this  investigation.  The  difference  or  similarity 
between these two definitions will be determined.

SECTION 1
SUMMARY


The  Viterbi  decoding  algorithm  yields  minimum  probability  of 
error  when  applied  to  a  memoryless  channel  provided  that  all  input  sequences 
are equally likely. In this report, the algorithm was generalized for application
to channels with finite memory and it was shown that the generalized
algorithm is also maximum-likelihood decoding. It was also shown that the
generalized Viterbi algorithm on a simple memory channel performs better
than the original Viterbi algorithm with the same decoding complexity.

The M-state Markov model was reviewed in this report. The
process of identifying the parameters of the M-state model from the
coefficients $A_i$ and $A_i(n_j, n_{j+1})$ of the gap model was determined to be
more complicated than was anticipated. As an alternative, the simple
partitioned Markov model was examined to determine the effect of the
second order statistics, namely the interdependence of the gaps, on the
error burst distribution. An alternative definition of the burst was
adopted to speed up this investigation. The difference or similarity
between these two definitions will be determined.

SECTION 2


VITERBI  DECODING  ALGORITHM 


2.1.  INTRODUCTION 

In 1955 Elias [1] introduced a new class of codes, called convolutional
codes (sometimes called recurrent codes), which has become an
important  alternative  to  the  block  coding  scheme.  Unlike  the  block  coding 
scheme  which  divides  a  string  of  information  digits  into  blocks  of  k  digits 
each  and  encodes  these  blocks  into  blocks  of  codewords  of  n  digits  each, 
the  convolutional  coding  scheme  takes  a  string  of  information  digits  of 
arbitrary  length  (can  be  semi-infinite)  and  encodes  it  into  a  single  string 
of  coded  digits.  More  on  the  structure  of  the  convolutional  code  will  be 
discussed  in  the  next  section. 

On a memoryless channel, there are three different procedures
for decoding convolutional codes, viz. threshold decoding, sequential
decoding, and Viterbi (maximum-likelihood) decoding. Sequential decoding
was first introduced in 1957 by Wozencraft [2]. It can be applied to any
convolutional code; however, the complexity of its decoding computations
is not fixed and can result in long delays in decoding. Threshold decoding
was introduced by Massey [3] in 1963. It is a sub-optimum decoding procedure
and its applicability is dependent on the individual code. In 1967, making
use of the fact that an information digit can affect the coded digits in
only a finite number of subsequent time periods, Viterbi proposed a new
decoding algorithm [4] for decoding any convolutional code in the presence
of noise due to a memoryless channel. His algorithm is a welcome alternative
to the threshold decoding and sequential decoding algorithms. Unlike
sequential decoding, the computational complexity for decoding a digit in
the Viterbi algorithm is fixed. Furthermore, it was proved later [5] that
the Viterbi algorithm is actually a maximum-likelihood decoding procedure.
Practical decoders based on the Viterbi algorithm have been built
and tested [6,7].

The Viterbi algorithm is a powerful procedure for decoding any
convolutional code on a memoryless channel. For such channels the decoding
procedure yields the minimum probability of error provided all input
sequences are equally likely and is therefore optimum in this sense.
Unfortunately, except for the space channel, most of the real channels
on earth are not memoryless channels. They generally exhibit a certain
degree of memory in their noise distributions and the errors tend to
cluster together in bursts. Thus, for such real channels, it would be
inappropriate to use the Viterbi algorithm, which is designed for
memoryless channels only. Using the algorithm in its present form would result
in sub-optimum performance. In order to achieve optimum performance, the
Viterbi maximum-likelihood decoding algorithm for the memoryless channel
must be modified or generalized so that it will still be a maximum-likelihood
decoding algorithm when used on such channels with memory.


2.2.  PRELIMINARIES 

The necessary basic understanding of convolutional codes will be
presented in this section. For further detail refer to [8]. For ease of
discussion, binary convolutional codes of rate 1/n will be considered here.
Generalization to nonbinary codes and any other rate is straightforward. A
rate 1/n convolutional code encoder is a linear finite-state machine
consisting of a (K-1)-stage shift register and n modulo-2 adders which
give the coded output, where $Kn$ is the constraint length of the code. An
example with K = 3 and n = 2 is shown in Fig. 1.


Fig. 1. A rate 1/2 encoder
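The encoder structure described above can be sketched in a few lines of Python. This is only an illustration: the tap connections of Fig. 1 are not recoverable from the text, so the generator taps below (the common K = 3, rate 1/2 pair 111 and 101) are an assumption, as are the names `conv_encode`, `gens` and `flush`.

```python
def conv_encode(data, gens, flush=False):
    """Rate 1/n convolutional encoder with a (K-1)-stage shift register.

    gens  -- one tuple of K taps per modulo-2 adder; tap 0 sits on the newest
             bit (ASSUMED taps, not those of Fig. 1).
    flush -- if True, append K-1 zeros so the register returns to state 0.
    Returns a list of n-bit output blocks, one block per input bit.
    """
    K = len(gens[0])
    register = [0] * (K - 1)                 # previous K-1 bits, newest first
    bits = list(data) + ([0] * (K - 1) if flush else [])
    out = []
    for a in bits:
        window = [a] + register              # current bit plus register contents
        block = tuple(sum(g & b for g, b in zip(taps, window)) % 2
                      for taps in gens)      # n modulo-2 adders
        out.append(block)
        register = window[:-1]               # shift the new bit in
    return out


ASSUMED_GENS = ((1, 1, 1), (1, 0, 1))        # hypothetical K = 3, n = 2 taps
impulse = conv_encode((1,), ASSUMED_GENS, flush=True)
```

Feeding a single 1 followed by K-1 zeros shows that one input digit influences exactly K output blocks, which is the finite-memory property the decoding algorithm relies on.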


During encoding the input data is shifted into the register one
bit at a time, causing the encoder to instantaneously produce n encoded
digits. This procedure continues until L data symbols are fed into the
shift register. The result is a code with a tree structure having L
branching levels. Each branch contains n encoded digits. An example of
the tree structure with L = 4 is given in Fig. 2 for the encoder of Fig. 1.
Branching upwards corresponds to an input of 0 while branching downwards
corresponds to an input of 1.

From Fig. 1 we can observe an important property of convolutional
codes: an input digit can affect the output digits in at most K time
periods. This can also be seen from the fact that the convolutional encoder
is a finite-state machine and the code can hence be represented by a
state-diagram of $2^{K-1}$ states, the states being the contents of the (K-1)-stage
shift register. The state diagram for the encoder of Fig. 1 is given in
Fig. 3.
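The state-diagram view can be made concrete by tabulating every transition of the finite-state machine. The sketch below is illustrative only; the taps are an assumed K = 3, rate 1/2 pair (111, 101) rather than those of Fig. 1, and `state_table` is a hypothetical helper name.

```python
from itertools import product


def state_table(gens):
    """Map (state, input bit) -> (next state, n-bit output block).

    A state is the content of the (K-1)-stage register, newest bit first,
    so the table has 2**(K-1) states with two outgoing transitions each.
    """
    K = len(gens[0])
    table = {}
    for state in product((0, 1), repeat=K - 1):
        for a in (0, 1):
            window = (a,) + state
            out = tuple(sum(g & b for g, b in zip(taps, window)) % 2
                        for taps in gens)
            table[(state, a)] = (window[:-1], out)
    return table


table = state_table(((1, 1, 1), (1, 0, 1)))   # ASSUMED taps
states = {state for state, _ in table}
```

For K = 3 this yields four states and eight transitions, matching the $2^{K-1}$ count in the text.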


2.3.  GENERALIZED  VITERBI  DECODING  ALGORITHM 

In this section the Viterbi decoding algorithm for the memoryless
channel  will  be  "re-invented"  using  an  approach  different  from  that  used  by 
Viterbi.  Viterbi  derived  his  algorithm  [4]  in  an  intuitive  manner  by 
observing  the  structure  of  the  convolutional  codes,  and  then  proved  that 
his  algorithm  was  actually  a  maximum-likelihood  decoding  algorithm  [5]. 

Here we shall start with the mathematical expression for maximum-likelihood
decoding and then derive from it the mathematical formulation of the Viterbi
algorithm. Since we start with the maximum-likelihood decoding and arrive
at the Viterbi decoding algorithm, it is clear that the Viterbi decoding
algorithm is a maximum-likelihood decoding algorithm. Using precisely the
same  approach  we  will  then  derive  a  generalized  formulation  of  the  Viterbi 
algorithm  so  that  it  can  handle  channels  with  finite  memory.  A  channel 
is  said  to  be  of  finite  memory  if  its  probability  of  a  bit  in  error  is 
dependent  on  a  finite  number  of  previous  bits.  In  particular  a  channel 
is said to have memory m if the probability of an error in a bit depends only
on the errors in the m preceding bits.

Let $a_1 a_2 \ldots a_L$ be the input data sequence to a (K-1)-stage shift
register encoder with n outputs. The encoded sequence will then be a string
of Ln symbols. Let $X_i = (x_{i1} x_{i2} \ldots x_{in})$ be the n-symbol output of the encoder
when $a_i$ is fed into the encoder, and let $Y_i = (y_{i1} y_{i2} \ldots y_{in})$ be the
corresponding received n-symbol codeword. The errors added by the channel form

Given a received sequence $Y_1 Y_2 \ldots Y_L$, maximum-likelihood decoding
would have to determine the data sequence $a_1 a_2 \ldots a_L$ for which the
likelihood function $P(Y_1 Y_2 \ldots Y_L \mid a_1 a_2 \ldots a_L)$ is the greatest among all possible
data sequences. In other words, the following operation must be performed:

$$\max_{a_1 \ldots a_L} P(Y_1 \ldots Y_L \mid a_1 \ldots a_L) \tag{2}$$


Since knowing $a_1 \ldots a_L$ is equivalent to knowing $X_1 \ldots X_L$, (2) is equivalent
to

$$\max_{a_1 \ldots a_L} P(E_1 \ldots E_L) \tag{3}$$

where $E_i = Y_i + X_i$, i.e., determining the most likely error pattern for
the channel being used. Using the identity

$$P(AB) = P(A)\,P(B \mid A) \tag{4}$$

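(2) can be written as

$$\max_{a_1 \ldots a_L} P(Y_1 \ldots Y_K \mid a_1 \ldots a_L)\, P(Y_{K+1} \mid a_1 \ldots a_L, Y_1 \ldots Y_K)\, P(Y_{K+2} \mid a_1 \ldots a_L, Y_1 \ldots Y_{K+1}) \cdots P(Y_{L-1} \mid a_1 \ldots a_L, Y_1 \ldots Y_{L-2})\, P(Y_L \mid a_1 \ldots a_L, Y_1 \ldots Y_{L-1}) \tag{5}$$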

Let us first consider a memoryless channel. For such a channel
the probability of receiving $Y_i$ depends only on the $X_i$ that was sent, which
in turn is a function of only $a_{i-K+1} \ldots a_i$, i.e.,

$$P(Y_i \mid a_1 \ldots a_L, Y_1 \ldots Y_{i-1}) = P(Y_i \mid a_{i-K+1} \ldots a_i) \tag{6}$$

Substituting (6) into (5) we can rewrite (5) as:

$$\max_{a_1 \ldots a_L} P(Y_1 \ldots Y_K \mid a_1 \ldots a_K)\, P(Y_{K+1} \mid a_2 \ldots a_{K+1}) \cdots P(Y_{L-1} \mid a_{L-K} \ldots a_{L-1})\, P(Y_L \mid a_{L-K+1} \ldots a_L) \tag{7}$$

In (7) we note the fact that the second and higher terms are independent
of $a_1$, and the third and higher terms are independent of $a_2$, etc. We
can regroup (7) into the form:

$$\max_{a_{L-K+1} \ldots a_L} P(Y_L \mid a_{L-K+1} \ldots a_L) \{ \max_{a_{L-K}} P(Y_{L-1} \mid a_{L-K} \ldots a_{L-1}) \{ \cdots \{ \max_{a_2} P(Y_{K+1} \mid a_2 \ldots a_{K+1}) \{ \max_{a_1} P(Y_1 \ldots Y_K \mid a_1 \ldots a_K) \} \} \cdots \} \} \tag{8}$$
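As a small worked instance of this regrouping, take K = 2 and L = 3 (values chosen here purely for illustration). Expression (7) then has only two factors, and only the first depends on $a_1$, so the maximization over $a_1$ can be carried out first for each value of $a_2$:

$$\max_{a_1 a_2 a_3} P(Y_1 Y_2 \mid a_1 a_2)\, P(Y_3 \mid a_2 a_3) = \max_{a_2 a_3} P(Y_3 \mid a_2 a_3) \{ \max_{a_1} P(Y_1 Y_2 \mid a_1 a_2) \}$$

The inner maximum is a function of $a_2$ alone, which is what later allows one survivor to be stored per value of the retained digits.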

In (8) each maximization procedure is over a single variable except
the first one, which is over the K variables $a_{L-K+1} \ldots a_L$. If during encoding
we agree to add K-1 zeros to the end of the data sequence $a_1 \ldots a_L$, (K-1)n
more digits will be transmitted and received, namely $X_{L+1} \ldots X_{L+K-1}$ and
$Y_{L+1} \ldots Y_{L+K-1}$ respectively. This additional sequence does not contain any
new information but will be seen shortly to be very helpful in simplifying
the decoding procedure. The maximum-likelihood decoding procedure of (2)
would now be

$$\max_{a_1 \ldots a_L} P(Y_1 Y_2 \ldots Y_{L+K-1} \mid a_1 \ldots a_L \underbrace{0 \ldots 0}_{K-1}) \tag{9}$$


The K-1 zeros in (9) may be omitted since they are known quantities.
Following the same procedure as before, an expression similar to (8)
can be derived from (9) except that $Y_L$ is now replaced by $Y_L \ldots Y_{L+K-1}$,
i.e.,

$$\max_{a_{L-K+1} \ldots a_L} P(Y_L \ldots Y_{L+K-1} \mid a_{L-K+1} \ldots a_L) \{ \max_{a_{L-K}} P(Y_{L-1} \mid a_{L-K} \ldots a_{L-1}) \{ \cdots \{ \max_{a_2} P(Y_{K+1} \mid a_2 \ldots a_{K+1}) \{ \max_{a_1} P(Y_1 \ldots Y_K \mid a_1 \ldots a_K) \} \} \cdots \} \} \tag{10}$$

Using (4) and (6), $P(Y_L \ldots Y_{L+K-1} \mid a_{L-K+1} \ldots a_L)$ can again be broken up into
K terms:

$$P(Y_L \ldots Y_{L+K-1} \mid a_{L-K+1} \ldots a_L) = P(Y_L \mid a_{L-K+1} \ldots a_L)\, P(Y_{L+1} \mid a_{L-K+2} \ldots a_L) \cdots P(Y_{L+K-2} \mid a_{L-1} a_L)\, P(Y_{L+K-1} \mid a_L) \tag{11}$$

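Noting once again that the second and higher terms of (11) are independent
of $a_{L-K+1}$, etc., the maximization process over $a_{L-K+1} \ldots a_L$ in (10) can
once again be broken into K maximizations each over a single variable:

$$\max_{a_L} P(Y_{L+K-1} \mid a_L) \{ \max_{a_{L-1}} P(Y_{L+K-2} \mid a_{L-1} a_L) \{ \cdots \{ \max_{a_{L-K+1}} P(Y_L \mid a_{L-K+1} \ldots a_L) \{ \max_{a_{L-K}} P(Y_{L-1} \mid a_{L-K} \ldots a_{L-1}) \{ \cdots \{ \max_{a_2} P(Y_{K+1} \mid a_2 \ldots a_{K+1}) \{ \max_{a_1} P(Y_1 \ldots Y_K \mid a_1 \ldots a_K) \} \} \cdots \} \} \cdots \} \} \tag{12}$$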

Equation (12) is the mathematical expression of a maximum-likelihood
decoding procedure for the convolutional code over a memoryless channel. Not
too surprisingly, it is the same as the Viterbi decoding algorithm. The
decoding procedure as represented by (12) may be interpreted in the following
way:

Step 1: Compute the likelihood functions $P(Y_1 \ldots Y_K \mid a_1 \ldots a_K)$ for all
$2^K$ possible paths $a_1 \ldots a_K$. For each of the $2^{K-1}$ paths $a_2 \ldots a_K$ choose
that $a_1'$ which gives the greatest likelihood function and call it the
survivor $A_1(a_2 \ldots a_K) = a_1'$.

Step 2: Compute the likelihood functions $P(Y_{K+1} \mid a_2 \ldots a_{K+1})$ and
multiply by their corresponding previous likelihood functions
$P(Y_1 \ldots Y_K \mid A_1(a_2 \ldots a_K) a_2 \ldots a_K)$ to form the new likelihood function
$P(Y_1 \ldots Y_{K+1} \mid A_1(a_2 \ldots a_K) a_2 \ldots a_{K+1})$ for all $2^K$ possible paths
$a_2 \ldots a_{K+1}$. For each path $a_3 \ldots a_{K+1}$ choose that $a_2'$ which gives
the greatest likelihood function and call the sequence $A_1(a_2' a_3 \ldots a_K) a_2'$
the survivor $A_2(a_3 \ldots a_{K+1})$.

Step 3 to Step L-K+1: Proceed in a similar manner as in Step 2.
In particular at the i-th step, $3 \le i \le L-K+1$, compute the $2^K$
likelihood functions $P(Y_{i+K-1} \mid a_i \ldots a_{i+K-1})$ and multiply by their
corresponding previous likelihood functions
$P(Y_1 \ldots Y_{i+K-2} \mid A_{i-1}(a_i \ldots a_{i+K-2}) a_i \ldots a_{i+K-2})$ to form the new likelihood function
$P(Y_1 \ldots Y_{i+K-1} \mid A_{i-1}(a_i \ldots a_{i+K-2}) a_i \ldots a_{i+K-1})$ for all possible
paths $a_i \ldots a_{i+K-1}$. For each path $a_{i+1} \ldots a_{i+K-1}$ choose that $a_i'$
which gives the greatest likelihood function and call the sequence
$A_{i-1}(a_i' a_{i+1} \ldots a_{i+K-2}) a_i'$ the survivor $A_i(a_{i+1} \ldots a_{i+K-1})$.

Step L-K+2 to Step L-1: Proceed in a similar manner as before,
except that the length of the path is shortened by one at the
end of each step. In particular at the i-th step, $L-K+2 \le i \le L-1$,
compute the $2^{L+1-i}$ likelihood functions for all possible paths
$a_i \ldots a_L$. For each path $a_{i+1} \ldots a_L$ choose that $a_i'$ which gives
the greatest likelihood function, and call the sequence
$A_{i-1}(a_i' \ldots a_L) a_i'$ the survivor $A_i(a_{i+1} \ldots a_L)$.

Step L: Compute the 2 likelihood functions for $a_L = 0$ and $a_L = 1$.
Choose that $a_L'$ which gives the greater likelihood function. The
survivor sequence $A_{L-1}(a_L') a_L'$ is the maximum-likelihood decoded
sequence.

From the above procedure it can be seen that at each step of the
decoding, except the final K-1 steps, maximization has to be done for $2^{K-1}$
different paths. Since there are also $2^{K-1}$ states in the state-diagram of
the code, the state-diagram can be used as a system diagram for the decoding
algorithm. At the end of each decoding step each state (path) remembers
its survivor and corresponding likelihood function. During the next decoding
period all the possible state transitions are made and the corresponding new
likelihood functions computed. Then at each state the new likelihood
functions are compared and the new survivor is chosen, and the system is
ready for another decoding period. During the final K-1 steps the same
thing happens, only that now a decreasing number of states would be involved
and an increasing number of states would become idle.
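The survivor bookkeeping described above can be sketched in Python as follows. Several simplifications here are assumptions rather than the report's formulation: the channel is taken to be a binary symmetric channel with crossover probability below 1/2, so that maximizing the likelihood is the same as minimizing Hamming distance; the encoder register is assumed to start at zero, so the search begins from one known state instead of enumerating the first $2^K$ paths; and the generator taps and the names `viterbi_bsc` and `encode_block` are hypothetical.

```python
def encode_block(window, gens):
    """n output bits for the K most recent data bits (oldest first)."""
    newest_first = window[::-1]
    return tuple(sum(g & b for g, b in zip(taps, newest_first)) % 2
                 for taps in gens)


def viterbi_bsc(received, L, K, gens):
    """Decode L data bits from L+K-1 received n-bit blocks (K-1 tail zeros).

    survivors maps a state (the last K-1 data bits) to the best
    (Hamming distance, data path) reaching that state.
    """
    survivors = {(0,) * (K - 1): (0, ())}
    for t in range(L + K - 1):
        nxt = {}
        for state, (dist, path) in survivors.items():
            for bit in ((0, 1) if t < L else (0,)):    # tail bits are known zeros
                window = state + (bit,)
                x = encode_block(window, gens)
                d = dist + sum(a != b for a, b in zip(x, received[t]))
                key = window[1:]                        # drop the oldest digit
                if key not in nxt or d < nxt[key][0]:   # keep one survivor per state
                    nxt[key] = (d, path + (bit,))
        survivors = nxt
    dist, path = min(survivors.values())
    return path[:L], dist
```

During the final K-1 steps only the zero input is allowed, so the number of live states halves at each step, mirroring the idle states mentioned in the text.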


We have seen so far that the Viterbi decoding algorithm for the
memoryless channel can be directly derived from the general maximum-likelihood
decoding formulation (2) by making use of property (6) of the
memoryless channel. Using precisely the same approach, we shall now show
that the Viterbi decoding algorithm can easily be generalized to handle
channels with finite memory.


Let us consider a channel with finite memory m. Let J be the
smallest integer greater than or equal to m/n. For such a channel, the
probability of receiving the initial sequence $Y_1 \ldots Y_J$ will not only depend
on $a_1 \ldots a_J$, but also on the error state $e_{-m+1} \ldots e_{-1} e_0$ of the channel before
the first digit $y_{11}$ is received. If we agree to transmit m zeros just
before we transmit the first coded digit $x_{11}$, the received digits
corresponding to these m zeros would tell us the error state $e_{-m+1} \ldots e_0$ of the
channel. Then the maximum-likelihood decoding formulation of (2) can be
slightly modified to

$$\max_{a_1 \ldots a_L} P(Y_1 \ldots Y_L \mid a_1 \ldots a_L, e_{-m+1} \ldots e_0) \tag{15}$$


Furthermore, if during encoding we agree to add K+J-1 zeros to the end
of the data sequence $a_1 \ldots a_L$, (K+J-1)n more digits will be transmitted
and received, namely $X_{L+1} \ldots X_{L+K+J-1}$ and $Y_{L+1} \ldots Y_{L+K+J-1}$ respectively.
Just as in the memoryless channel case, this additional sequence does
not contain any new information but will also be seen to simplify the
decoding procedure. The maximum-likelihood decoding procedure of (15)
would now be

$$\max_{a_1 \ldots a_L} P(Y_1 \ldots Y_{L+K+J-1} \mid a_1 \ldots a_L, e_{-m+1} \ldots e_0) \tag{16}$$


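Applying identity (4) to (16) as before, (16) can be written as

$$\max_{a_1 \ldots a_L} P(Y_1 \ldots Y_{K+J} \mid a_1 \ldots a_L, e_{-m+1} \ldots e_0)\, P(Y_{K+J+1} \mid a_1 \ldots a_L, Y_1 \ldots Y_{K+J}, e_{-m+1} \ldots e_0) \cdots P(Y_{L+K+J-2} \mid a_1 \ldots a_L, Y_1 \ldots Y_{L+K+J-3}, e_{-m+1} \ldots e_0)\, P(Y_{L+K+J-1} \mid a_1 \ldots a_L, Y_1 \ldots Y_{L+K+J-2}, e_{-m+1} \ldots e_0) \tag{17}$$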

For the channel with memory m, the probability of receiving $Y_i$, $i > J$,
will depend not only on $X_i$, which gives the error pattern $E_i = Y_i + X_i$,
but also on the previous error sequence
$E_{i-J} \ldots E_{i-1} = (Y_{i-J} \ldots Y_{i-1}) + (X_{i-J} \ldots X_{i-1})$.
The sequence $X_{i-J} \ldots X_i$ as a whole is a function of $a_{i-J-K+1} \ldots a_i$.
Thus for this channel

$$P(Y_i \mid a_1 \ldots a_L, Y_1 \ldots Y_{i-1}, e_{-m+1} \ldots e_0) = P(Y_i \mid a_{i-J-K+1} \ldots a_i, Y_{i-J} \ldots Y_{i-1}) \tag{18}$$


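The likelihood function on the right-hand side of (18) is easy to compute
since

$$\begin{aligned} P(Y_i \mid a_{i-J-K+1} \ldots a_i, Y_{i-J} \ldots Y_{i-1}) &= P(Y_i \mid X_{i-J} \ldots X_i, Y_{i-J} \ldots Y_{i-1}) \\ &= P(Y_i + X_i \mid (Y_{i-J} \ldots Y_{i-1}) + (X_{i-J} \ldots X_{i-1})) \\ &= P(E_i \mid E_{i-J} \ldots E_{i-1}) \end{aligned} \tag{19}$$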

which can be computed from the channel model parameters. Substituting
(18) into (17) we can rewrite (17) as

$$\max_{a_1 \ldots a_L} P(Y_1 \ldots Y_{K+J} \mid a_1 \ldots a_{K+J}, e_{-m+1} \ldots e_0)\, P(Y_{K+J+1} \mid a_2 \ldots a_{K+J+1}, Y_{K+1} \ldots Y_{K+J}) \cdots P(Y_{L+K+J-2} \mid a_{L-1} a_L, Y_{L+K-2} \ldots Y_{L+K+J-3})\, P(Y_{L+K+J-1} \mid a_L, Y_{L+K-1} \ldots Y_{L+K+J-2}) \tag{20}$$


Once again we note the fact that the second and higher terms of (20) are
independent of $a_1$ and the third and higher terms are independent of $a_2$,
etc., so we can regroup (20) into the form:

$$\max_{a_L} P(Y_{L+K+J-1} \mid a_L, Y_{L+K-1} \ldots Y_{L+K+J-2}) \{ \max_{a_{L-1}} P(Y_{L+K+J-2} \mid a_{L-1} a_L, Y_{L+K-2} \ldots Y_{L+K+J-3}) \{ \cdots \{ \max_{a_2} P(Y_{K+J+1} \mid a_2 \ldots a_{K+J+1}, Y_{K+1} \ldots Y_{K+J}) \{ \max_{a_1} P(Y_1 \ldots Y_{K+J} \mid a_1 \ldots a_{K+J}, e_{-m+1} \ldots e_0) \} \} \cdots \} \} \tag{21}$$


Equation (21) is the mathematical expression of a maximum-likelihood
decoding procedure for the convolutional code over a channel with finite
memory m. We shall call it the generalized Viterbi decoding algorithm. The
decoding procedure as represented by (21) may be interpreted in the following
way:


Step 1: Compute the likelihood functions
$P(Y_1 \ldots Y_{K+J} \mid a_1 \ldots a_{K+J}, e_{-m+1} \ldots e_0)$ for all $2^{K+J}$ possible paths
$a_1 \ldots a_{K+J}$. For each of the $2^{K+J-1}$ paths $a_2 \ldots a_{K+J}$ choose that $a_1'$
which gives the greatest likelihood function and call it the survivor
$A_1(a_2 \ldots a_{K+J}) = a_1'$.

Step 2: Compute the likelihood functions
$P(Y_{K+J+1} \mid a_2 \ldots a_{K+J+1}, Y_{K+1} \ldots Y_{K+J})$ and multiply by their corresponding
previous likelihood functions
$P(Y_1 \ldots Y_{K+J} \mid A_1(a_2 \ldots a_{K+J}) a_2 \ldots a_{K+J}, e_{-m+1} \ldots e_0)$ to
form the new likelihood function
$P(Y_1 \ldots Y_{K+J+1} \mid A_1(a_2 \ldots a_{K+J}) a_2 \ldots a_{K+J+1}, e_{-m+1} \ldots e_0)$
for all $2^{K+J}$ possible paths $a_2 \ldots a_{K+J+1}$.
For each path $a_3 \ldots a_{K+J+1}$ choose that $a_2'$ which gives the greatest
likelihood function and call the sequence $A_1(a_2' a_3 \ldots a_{K+J}) a_2'$ the
survivor $A_2(a_3 \ldots a_{K+J+1})$.

Step 3 to Step L-K-J+1: Proceed in a similar manner as in
Step 2. In particular at the i-th step, $3 \le i \le L-K-J+1$, compute
the $2^{K+J}$ likelihood functions
$P(Y_{i+K+J-1} \mid a_i \ldots a_{i+K+J-1}, Y_{i+K-1} \ldots Y_{i+K+J-2})$
and multiply by the corresponding previous likelihood functions
$P(Y_1 \ldots Y_{i+K+J-2} \mid A_{i-1}(a_i \ldots a_{i+K+J-2}) a_i \ldots a_{i+K+J-2}, e_{-m+1} \ldots e_0)$
to form the new likelihood function
$P(Y_1 \ldots Y_{i+K+J-1} \mid A_{i-1}(a_i \ldots a_{i+K+J-2}) a_i \ldots a_{i+K+J-1}, e_{-m+1} \ldots e_0)$
for all possible paths $a_i \ldots a_{i+K+J-1}$. For
each path $a_{i+1} \ldots a_{i+K+J-1}$ choose that $a_i'$ which gives the greatest
likelihood function and call the sequence $A_{i-1}(a_i' a_{i+1} \ldots a_{i+K+J-2}) a_i'$
the survivor $A_i(a_{i+1} \ldots a_{i+K+J-1})$.

Step L-K-J+2 to Step L-1: Proceed in a similar manner as before,
except that the length of the path is shortened by one at the end
of each step. In particular, at the i-th step, $L-K-J+2 \le i \le L-1$,
compute the $2^{L+1-i}$ likelihood functions for all possible paths $a_i \ldots a_L$.
For each path $a_{i+1} \ldots a_L$ choose that $a_i'$ which gives the greatest
likelihood function, and call the sequence $A_{i-1}(a_i' \ldots a_L) a_i'$ the
survivor $A_i(a_{i+1} \ldots a_L)$.

Step L: Compute the 2 likelihood functions for $a_L = 0$ and $a_L = 1$.
Choose that $a_L'$ which gives the greater likelihood function. The
survivor sequence $A_{L-1}(a_L') a_L'$ is the maximum-likelihood decoded
sequence.


From the above it can easily be seen that the decoding procedure as
represented by (21) is indeed a generalized Viterbi decoding algorithm since it
contains the Viterbi algorithm as a special case when the memory of the channel
m is equal to zero. For the channel with memory, it is seen that at each step
of the decoding, except the final K+J-1 steps, maximization has to be done for
$2^{K+J-1}$ different paths. Since there are only $2^{K-1}$ states in the state-diagram
of the code, the state-diagram cannot be used as a system diagram for the decoding
algorithm as in the memoryless channel case. However, any $2^{K-1}$-state
state-diagram can be expanded into a $2^{K+J-1}$-state state-diagram.
The easiest way to see this is to look at the encoder. One can add J stages of
dummy shift register to the original K-1 stages of shift register in the encoder
and then consider it as a (K+J-1)-stage finite-state machine. This "new" encoder
can thus be represented by a state-diagram with $2^{K+J-1}$ states. An example
of such a procedure for the encoder of Fig. 1 with J = 1 is shown in Fig. 4.
The expanded state-diagram obtained by the procedure just described may now be
used as a system diagram for the generalized Viterbi algorithm in exactly the
same manner as in the memoryless channel case. Thus the complexity of a
generalized Viterbi decoder with parameters K and J is about the same as that
of a Viterbi decoder for the memoryless channel with parameters K' = K+J.
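A minimal Python sketch of decoding on the expanded state-diagram is given below. It is a sketch under stated assumptions, not the report's implementation: the channel is assumed to act bit by bit, described by a hypothetical function `p_err(history)` that returns the probability of an error given the previous m error bits (strictly between 0 and 1); the encoder register is assumed to start at zero, so the search begins from one known state instead of enumerating the first $2^{K+J}$ paths; log-likelihoods are summed instead of multiplying probabilities; and the generator taps and all function names are illustrative.

```python
import math


def encode_block(window, gens):
    """n output bits for the K most recent data bits (oldest first)."""
    newest_first = window[::-1]
    return tuple(sum(g & b for g, b in zip(taps, newest_first)) % 2
                 for taps in gens)


def block_log_prob(prev_errors, cur_errors, m, p_err):
    """log P(E_i | E_{i-J} ... E_{i-1}) for a bitwise channel of memory m."""
    hist = tuple(prev_errors[len(prev_errors) - m:])   # last m error bits
    total = 0.0
    for e in cur_errors:
        p1 = p_err(hist)
        total += math.log(p1 if e else 1.0 - p1)
        hist = (hist + (e,))[1:]
    return total


def generalized_viterbi(received, L, K, gens, m, p_err, init_errors):
    """Decode L data bits from L+K+J-1 received blocks (K+J-1 tail zeros)."""
    n = len(gens)
    J = -(-m // n)                                     # smallest integer >= m/n
    # J leading blocks carry the known initial error state e_{-m+1} ... e_0.
    pre = (0,) * (J * n - m) + tuple(init_errors)
    y = [pre[j * n:(j + 1) * n] for j in range(J)] + [tuple(b) for b in received]
    survivors = {(0,) * (K + J - 1): (0.0, ())}        # expanded state -> (score, path)
    for t in range(L + K + J - 1):
        nxt = {}
        for state, (score, path) in survivors.items():
            for bit in ((0, 1) if t < L else (0,)):
                window = state + (bit,)                # K+J most recent data bits
                errs = []
                for j in range(J + 1):                 # error blocks E_{i-J} ... E_i
                    x = encode_block(window[j:j + K], gens)
                    errs.append(tuple(a ^ b for a, b in zip(x, y[t + j])))
                prev = tuple(e for blk in errs[:-1] for e in blk)
                s = score + block_log_prob(prev, errs[-1], m, p_err)
                key = window[1:]
                if key not in nxt or s > nxt[key][0]:
                    nxt[key] = (s, path + (bit,))
        survivors = nxt
    score, path = max(survivors.values())
    return path[:L], score
```

Each state holds K+J-1 data bits, which is why the work per step matches that of a memoryless Viterbi decoder with K' = K+J. Setting m = 0 gives J = 0 and reduces the branch metric to the memoryless case.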


2.4  EXAMPLE 

As a very primitive example of comparing the performance of the
generalized Viterbi decoder with that of a Viterbi decoder for the memoryless
channel of the same complexity, let us use the code generated by the K = 3
encoder of Fig. 1 for the memoryless Viterbi decoder and use the code generated
by the K = 2 encoder of Fig. 5 for the generalized Viterbi decoder. Assume
L = 3 in both cases. Furthermore let us choose a very simple channel model
representing a channel with finite memory m = 2. Such a model is completely
specified by the following set of conditional probabilities:

$$\begin{aligned} P(1 \mid 00) &= 10^{-3} \\ P(1 \mid 01) &= 0.5 \\ P(1 \mid 10) &= 0.5 \\ P(1 \mid 11) &= 0.5 \end{aligned} \tag{22}$$

where a 1 indicates a channel error and a 0 indicates no error. Since
m = n = 2, J = 1 for the generalized Viterbi decoder and the two decoders
have the same degree of complexity with regard to hardware configuration and
number of states.
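The probability of any error pattern under the model (22) follows from the chain rule, one digit at a time. The sketch below assumes the channel starts in the error-free state 00; the names `P_ERR` and `pattern_probability` are illustrative. Because the three states other than 00 share the same value 0.5, the order in which the two conditioning bits are written does not affect the result.

```python
# P(error | two previous error bits), taken from (22).
P_ERR = {(0, 0): 1e-3, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}


def pattern_probability(errors, state=(0, 0)):
    """P(error pattern | initial state) for the memory-2 channel of (22)."""
    prob = 1.0
    for e in errors:
        p1 = P_ERR[state]
        prob *= p1 if e else 1.0 - p1
        state = (state[1], e)          # slide the two-bit error history
    return prob


burst = pattern_probability((1, 1))    # an error followed by another error
```

A single error costs a factor of $10^{-3}$, but every digit in the two positions after it is in error with probability 0.5, which is how the model produces bursts.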


Let us assume that the channel is error-free when the actual transmission
begins, i.e., $e_0 = e_{-1} = 0$. Since both of the codes are linear codes,
we can form a standard array for each of the codes. For the K = 3 code, choose
the vector with the minimum weight in each coset as the coset leader. These
would be the correctable error patterns chosen by the memoryless Viterbi
decoder. For the K = 2 code choose the vector with the highest probability,
given $e_0 = e_{-1} = 0$, in each coset as its coset leader. These would be the
correctable error patterns chosen by the generalized Viterbi decoder.
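The two ways of choosing coset leaders can be compared on a toy code. The sketch below uses the length-3 repetition code purely for illustration (it is not one of the report's codes, whose generators are not recoverable here) together with the channel of (22) started in state 00; `coset_leaders` and `pattern_probability` are hypothetical helper names.

```python
from itertools import product

# P(error | two previous error bits), taken from (22).
P_ERR = {(0, 0): 1e-3, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}


def pattern_probability(errors, state=(0, 0)):
    """P(error pattern | initial error state) under the channel of (22)."""
    prob = 1.0
    for e in errors:
        p1 = P_ERR[state]
        prob *= p1 if e else 1.0 - p1
        state = (state[1], e)
    return prob


def coset_leaders(codewords, score):
    """Pick, in every coset of a linear code, the vector with the highest score."""
    n = len(codewords[0])
    seen, leaders = set(), []
    for v in product((0, 1), repeat=n):
        if v in seen:
            continue
        coset = [tuple(a ^ b for a, b in zip(v, c)) for c in codewords]
        seen.update(coset)
        leaders.append(max(coset, key=score))
    return leaders


TOY_CODE = [(0, 0, 0), (1, 1, 1)]                                 # illustrative only
by_weight = coset_leaders(TOY_CODE, lambda e: -sum(e))            # minimum weight
by_probability = coset_leaders(TOY_CODE, pattern_probability)     # most probable
```

On this toy code the minimum-weight rule selects 100 as a leader, while the probability rule selects the burst pattern 011 from the same coset, since a second error directly after a first one is far more likely than an isolated early error followed by two correct digits.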

If no coding is used at all in the channel, the probability of
error $P_{E_n}$ is

$$P_{E_n} = 1 - P(00 \ldots 0 \mid 00) = 9.955 \times 10^{-3}$$

If the K = 3 code is used in conjunction with the memoryless Viterbi decoder,
the probability of error $P_{E_m}$ is

$$P_{E_m} = 1 - P(0 \ldots 0 \mid 00) - \sum_{E \in S} P(E \mid 00) = 3.549 \times 10^{-3}$$

where S is the set of all correctable error patterns. For the K = 2 code
and the generalized Viterbi decoder, the probability of error $P_{E_g} = 2.583 \times 10^{-3}$.
Thus the generalized Viterbi decoder slightly outperforms the memoryless channel
Viterbi decoder.

2.5.  CONCLUDING  REMARKS 

In  this  report  we  have  developed  a  generalized  version  of  the  Viterbi 
decoding  algorithm.  This  generalized  algorithm  can  be  used  to  perform  maximum* 
likelihood  decoding  on  any  channel  with  finite-memory  and  is  thus  an  optimum 
decoding  algorithm  for  such  channels.  It  is  pointed  out  that  the  complexity 
of  a  generalised  Viterbi  decoder  with  parameters  K  and  J  is  about  the  same  as 
that  of  a  Viterbi  decoder  for  the  memory less  channel  with  parameter  K'  ■  K+J, 

In a simple example we have seen that the generalized Viterbi algorithm
outperforms the memoryless Viterbi algorithm when the complexities of the two
decoders are about the same. Although this result has not been proved for all
channels with finite memory and all possible codes, it shows that, at least in
certain cases, the generalized Viterbi decoding algorithm is superior to the
memoryless Viterbi algorithm when the complexity of each decoder is kept the
same. Furthermore, if it is the complexity of the encoder rather than the
decoder that is kept the same, then the generalized Viterbi algorithm, which
performs maximum-likelihood decoding, will definitely be superior. Thus, in
systems where the cost of the encoder is the main concern and the cost of the
decoder is of little concern, the generalized Viterbi algorithm should be
used. One such example is a large number of sources (and thus encoders)
transmitting data to a data processing center that uses a general-purpose
computer as the decoder, so that the cost of implementing either decoding
algorithm is the same.

SECTION 3

MARKOV  CHAIN  MODEL 


3.1.  A  REVIEW  OF  THE  M-STATE  MARKOV  MODEL 

The M-state Markov model is described by the transition probability
matrix $P = \{p_{ij}\}$ and a set of probabilities of error in each state,
$\bar{h}_i = 1 - h_i$. $M \times M$ parameters are required to determine the
model completely. It has been contemplated that a set of the coefficients
$A_i$ and $A_i(n_j, n_{j+1})$ of the unconditional and conditional gap
distributions can be used to identify these parameters. However, a close
examination revealed that the process of identification is considerably more
complicated than had been anticipated.


The elements of the matrix D are defined [1] by

$$D_{ij} = p_{ij} h_j \tag{1}$$

D can be diagonalized as follows:

$$D = G \alpha G^{-1} \tag{2}$$

where $\alpha$ is a diagonal matrix whose elements are the eigenvalues
$\alpha_i$ of the matrix D, and G is a non-singular matrix whose columns are
the eigenvectors corresponding to the $\alpha_i$. Since the eigenvectors are
unique up to a scalar, they can be chosen such that each column of G sums to
unity, i.e.,

$$\sum_{i=1}^{M} G_{ij} = 1, \qquad j = 1, \ldots, M \tag{3}$$
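The construction in (1) to (3) can be sketched numerically. The following is a minimal illustration, not the authors' program: it assumes NumPy, a row-stochastic P, and a hypothetical 3-state parameter set (the values of `P` and `h` and the function name `diagonalize_D` are invented for the example). The symmetric P is chosen only so that the eigenvalues come out real.

```python
import numpy as np

def diagonalize_D(P, h):
    """Return (D, alpha, G) with D = G diag(alpha) G^{-1} and unit column sums in G."""
    P = np.asarray(P, dtype=float)
    h = np.asarray(h, dtype=float)
    D = P * h[np.newaxis, :]            # D[i, j] = p_ij * h_j, equation (1)
    alpha, G = np.linalg.eig(D)         # columns of G are eigenvectors of D
    col_sums = G.sum(axis=0)
    if np.any(np.isclose(col_sums, 0.0)):
        raise ValueError("an eigenvector has zero column sum; cannot normalize")
    G = G / col_sums                    # rescale each column to sum to one, equation (3)
    return D, alpha, G

# Hypothetical example: rows of P sum to one, h[j] = probability of no error in state j.
P = np.array([[0.90, 0.05, 0.05],
              [0.05, 0.85, 0.10],
              [0.05, 0.10, 0.85]])
h = np.array([1.0, 0.9, 0.3])

D, alpha, G = diagonalize_D(P, h)
D_rebuilt = G @ np.diag(alpha) @ np.linalg.inv(G)   # equation (2)
```

Rescaling an eigenvector does not change the product in (2), which is why the normalization in (3) can be imposed freely whenever no column sum vanishes.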


It has been shown [2] that

$$P(m) = \sum_{i=1}^{M} A_i \alpha_i^m \tag{4}$$

$$P(m/n) = \sum_{i=1}^{M} A_i(n)\, \alpha_i^m \tag{5}$$

and

$$P(m / n_j \le n < n_{j+1}) = \sum_{i=1}^{M} A_i(n_j, n_{j+1})\, \alpha_i^m \tag{6}$$

It has also been shown [3] that

$$P(m) = x^T D^m e \tag{7}$$

$$P(m/n) = \frac{x^T D^n \bar{D} D^m e}{x^T D^n e - x^T D^{n+1} e} \tag{8}$$

$$P(m / n_j \le n < n_{j+1}) = \frac{\sum_{n=n_j}^{n_{j+1}-1} x^T D^n \bar{D} D^m e}{x^T D^{n_j} e - x^T D^{n_{j+1}} e} \tag{9}$$
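Equations (7) and (8) can be evaluated directly by repeated vector-matrix products. The sketch below is an illustration under stated assumptions rather than the method of [3]: P is taken to be row-stochastic, the matrix written with a bar is assumed to be P - D (elements p_ij (1 - h_j)), the initial vector x is assumed proportional to p_i (1 - h_i) with p the stationary distribution, and P(m) is read as the probability that at least m error-free bits follow an error. All function names and parameter values are hypothetical.

```python
def vec_mat(v, A):
    """Row vector times matrix."""
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]

def mat_vec(A, v):
    """Matrix times column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def build_model(P, h, iters=2000):
    """Return D, D_bar and the initial vector x for a row-stochastic P."""
    M = len(P)
    D = [[P[i][j] * h[j] for j in range(M)] for i in range(M)]
    D_bar = [[P[i][j] * (1.0 - h[j]) for j in range(M)] for i in range(M)]
    p = [1.0 / M] * M
    for _ in range(iters):              # power iteration for the stationary row vector
        p = vec_mat(p, P)
    w = [p[i] * (1.0 - h[i]) for i in range(M)]
    x = [wi / sum(w) for wi in w]       # state distribution at an error (assumed)
    return D, D_bar, x

def power_right(D, m):
    """Return D^m e, with e the all-ones column vector."""
    e = [1.0] * len(D)
    for _ in range(m):
        e = mat_vec(D, e)
    return e

def gap(x, D, m):
    """Equation (7): P(m) = x^T D^m e."""
    return dot(x, power_right(D, m))

def gap_given_previous(x, D, D_bar, m, n):
    """Equation (8): P(m/n) for a previous gap of exactly n bits."""
    left = x
    for _ in range(n):
        left = vec_mat(left, D)         # x^T D^n
    numerator = dot(vec_mat(left, D_bar), power_right(D, m))
    denominator = sum(left) - sum(vec_mat(left, D))
    return numerator / denominator

P = [[0.90, 0.05, 0.05],
     [0.05, 0.85, 0.10],
     [0.05, 0.10, 0.85]]
h = [1.0, 0.9, 0.3]
D, D_bar, x = build_model(P, h)
unconditional = [gap(x, D, m) for m in range(6)]
conditional = [gap_given_previous(x, D, D_bar, m, n=2) for m in range(6)]
```

Equation (9) follows the same pattern, with the numerator summed over n from n_j to n_{j+1} - 1.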

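Consider the case of M = 3. By definition,

\begin{align}
P(m) &= x^T D^m e = x^T G \alpha^m G^{-1} e \notag \\
&= \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix}
\begin{pmatrix} G_{11} & G_{12} & G_{13} \\ G_{21} & G_{22} & G_{23} \\ G_{31} & G_{32} & G_{33} \end{pmatrix}
\begin{pmatrix} \alpha_1^m & 0 & 0 \\ 0 & \alpha_2^m & 0 \\ 0 & 0 & \alpha_3^m \end{pmatrix}
\begin{pmatrix} G^{(-1)}_{11} & G^{(-1)}_{12} & G^{(-1)}_{13} \\ G^{(-1)}_{21} & G^{(-1)}_{22} & G^{(-1)}_{23} \\ G^{(-1)}_{31} & G^{(-1)}_{32} & G^{(-1)}_{33} \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \notag \\
&= \left(G^{(-1)}_{11} + G^{(-1)}_{12} + G^{(-1)}_{13}\right)\left(x_1 G_{11} + x_2 G_{21} + x_3 G_{31}\right) \alpha_1^m \notag \\
&\quad + \left(G^{(-1)}_{21} + G^{(-1)}_{22} + G^{(-1)}_{23}\right)\left(x_1 G_{12} + x_2 G_{22} + x_3 G_{32}\right) \alpha_2^m \notag \\
&\quad + \left(G^{(-1)}_{31} + G^{(-1)}_{32} + G^{(-1)}_{33}\right)\left(x_1 G_{13} + x_2 G_{23} + x_3 G_{33}\right) \alpha_3^m \tag{10}
\end{align}

where $G^{(-1)}_{ij}$ denotes the (i, j) element of $G^{-1}$ and

$$x_i = \frac{p_i \bar{h}_i}{\sum_{k=1}^{3} p_k \bar{h}_k} \tag{11}$$

where $\bar{h}_i = 1 - h_i$ and $p_i$ is the stationary probability of the i-th
state. The $p_i$ satisfy the following relations: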


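$$\begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix} =
\begin{pmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{pmatrix}
\begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix} \tag{12}$$

$$p_1 + p_2 + p_3 = 1 \tag{13}$$

The stationary probabilities $p_i$ can be expressed in terms of the transition
probabilities $p_{ij}$ as follows:

$$p_1 = \frac{(1-p_{22})(1-p_{33}) - p_{23}p_{32}}{(1-p_{22})(1-p_{33}) - p_{23}p_{32} + p_{21}(1-p_{33}) + p_{23}p_{31} + p_{31}(1-p_{22}) + p_{32}p_{21}} \tag{14}$$

$$p_2 = \frac{(1-p_{11})(1-p_{33}) - p_{13}p_{31}}{(1-p_{11})(1-p_{33}) - p_{13}p_{31} + p_{12}(1-p_{33}) + p_{13}p_{32} + p_{32}(1-p_{11}) + p_{31}p_{12}} \tag{15}$$

$$p_3 = \frac{(1-p_{11})(1-p_{22}) - p_{12}p_{21}}{(1-p_{11})(1-p_{22}) - p_{12}p_{21} + p_{13}(1-p_{22}) + p_{12}p_{23} + p_{23}(1-p_{11}) + p_{21}p_{13}} \tag{16}$$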
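The closed forms (14) to (16) can be checked against a direct iteration of relation (12). This sketch takes (12) literally, with the stationary vector multiplying the matrix from the right, so the hypothetical example matrix has columns that sum to one; for a row-stochastic matrix the transpose would be passed instead. The function names and numerical values are illustrative assumptions.

```python
def stationary_closed_form(P):
    """Evaluate (14), (15), (16) for a 3x3 matrix P satisfying p = P p."""
    (p11, p12, p13), (p21, p22, p23), (p31, p32, p33) = P
    a1 = (1 - p22) * (1 - p33) - p23 * p32
    a2 = (1 - p11) * (1 - p33) - p13 * p31
    a3 = (1 - p11) * (1 - p22) - p12 * p21
    p1 = a1 / (a1 + p21 * (1 - p33) + p23 * p31 + p31 * (1 - p22) + p32 * p21)
    p2 = a2 / (a2 + p12 * (1 - p33) + p13 * p32 + p32 * (1 - p11) + p31 * p12)
    p3 = a3 / (a3 + p13 * (1 - p22) + p12 * p23 + p23 * (1 - p11) + p21 * p13)
    return [p1, p2, p3]

def stationary_by_iteration(P, iters=500):
    """Iterate p <- P p from the uniform vector (relation (12) as written)."""
    p = [1.0 / 3.0] * 3
    for _ in range(iters):
        p = [sum(P[i][j] * p[j] for j in range(3)) for i in range(3)]
    return p

# Hypothetical matrix whose columns sum to one.
P = [[0.5, 0.2, 0.1],
     [0.3, 0.6, 0.3],
     [0.2, 0.2, 0.6]]

closed = stationary_closed_form(P)
iterated = stationary_by_iteration(P)
```

For this example both routes give approximately (0.238, 0.429, 0.333), and the three closed-form values sum to one as (13) requires.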


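The probability transition matrix P can be expressed in terms of D and,
in turn, in terms of G, $\alpha$, and $G^{-1}$:

$$P = D\, \mathrm{diag}\!\left(\frac{1}{h_1}, \ldots, \frac{1}{h_M}\right) = G \alpha G^{-1}\, \mathrm{diag}\!\left(\frac{1}{h_1}, \ldots, \frac{1}{h_M}\right)$$

or explicitly

$$\begin{pmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{pmatrix} =
\begin{pmatrix} G_{11} & G_{12} & G_{13} \\ G_{21} & G_{22} & G_{23} \\ G_{31} & G_{32} & G_{33} \end{pmatrix}
\begin{pmatrix} \alpha_1 & 0 & 0 \\ 0 & \alpha_2 & 0 \\ 0 & 0 & \alpha_3 \end{pmatrix}
\begin{pmatrix} G^{(-1)}_{11} & G^{(-1)}_{12} & G^{(-1)}_{13} \\ G^{(-1)}_{21} & G^{(-1)}_{22} & G^{(-1)}_{23} \\ G^{(-1)}_{31} & G^{(-1)}_{32} & G^{(-1)}_{33} \end{pmatrix}
\begin{pmatrix} 1/h_1 & 0 & 0 \\ 0 & 1/h_2 & 0 \\ 0 & 0 & 1/h_3 \end{pmatrix}$$

Carrying out the matrix multiplication, the result is

$$p_{ij} = \frac{1}{h_j}\left(\alpha_1 G_{i1} G^{(-1)}_{1j} + \alpha_2 G_{i2} G^{(-1)}_{2j} + \alpha_3 G_{i3} G^{(-1)}_{3j}\right) \tag{17}$$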


Substituting (17) into (14), (15), and (16), then into (11), and finally into
(10), it becomes clear that the resulting expressions are enormously complex.
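Although the symbolic substitution is unwieldy, relation (17) itself is easy to confirm numerically. The sketch below assumes NumPy, a row-stochastic P with all h_j nonzero, and the same hypothetical 3-state values used purely for illustration; it rebuilds P from the eigenvalues and eigenvectors of D in both matrix and element form.

```python
import numpy as np

# Hypothetical parameters (not taken from measured data).
P = np.array([[0.90, 0.05, 0.05],
              [0.05, 0.85, 0.10],
              [0.05, 0.10, 0.85]])
h = np.array([1.0, 0.9, 0.3])

alpha, G = np.linalg.eig(P * h)     # D[i, j] = p_ij * h_j
G_inv = np.linalg.inv(G)

# Matrix form: P = G alpha G^{-1} diag(1/h)
P_matrix = G @ np.diag(alpha) @ G_inv @ np.diag(1.0 / h)

# Element form, equation (17)
M = len(h)
P_elements = np.array([
    [sum(alpha[k] * G[i, k] * G_inv[k, j] for k in range(M)) / h[j]
     for j in range(M)]
    for i in range(M)
])
```

The scaling of the eigenvectors cancels between G and its inverse, so the normalization (3) is not needed for this check.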


3.2. AN ALTERNATIVE APPROACH

The complexity in identifying the parameters of the general Markov
model stems from the fact that there are too many parameters to be determined.
Some of the parameters could be specified beforehand as 0 or 1 to reduce the
complexity. Several simpler models were discussed in Quarterly Reports
No. 2 and No. 4. Among them, the simple partitioned Markov chain model has been
extensively studied [4], [5], [6]. This model is completely determined by 2(M-1)
parameters instead of the M^2 parameters of the general Markov model, and these
2(M-1) parameters can be uniquely derived from the unconditional gap
distribution, which is sometimes referred to as the error-free run distribution.


The general Markov model, were it possible to derive it from the
unconditional and conditional gap distributions, would yield the same
unconditional gap distribution as the simple partitioned Markov chain model.
It would also yield the conditional gap distributions, whereas the simple
partitioned Markov model does not exhibit the interdependence between the gaps
because it has only a single error state.


Investigation of the effect of these second-order statistics,
namely the interdependence of the gaps, on the error burst distribution is
under way.

A computer program has been written to generate error sequences from the
probability transition matrix characterizing the simple partitioned Markov
model. Burst distributions are calculated from three sources: the original
error sequence, the error sequence generated from the gap model, and the error
sequence generated from the simple Markov model. Some preliminary results
have been obtained and are presently under study.
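The program itself is not reproduced in this report; the following is only a sketch of how such a generator might look. It assumes a particular structure for the simple partitioned model (M-1 error-free states that either persist or move to the single error state, which is the last state), a row-stochastic transition matrix, and hypothetical parameter values. The names `partitioned_chain`, `generate_errors`, and `gap_lengths_of` are invented for the example.

```python
import random

def partitioned_chain(stay, enter):
    """Build the transition matrix from 2(M-1) parameters.

    stay[i]  : probability of remaining in error-free state i
    enter[i] : probability of moving from the error state to error-free state i
    """
    M = len(stay) + 1
    P = [[0.0] * M for _ in range(M)]
    for i, s in enumerate(stay):
        P[i][i] = s
        P[i][M - 1] = 1.0 - s           # leave state i only by making an error
    for i, q in enumerate(enter):
        P[M - 1][i] = q
    P[M - 1][M - 1] = 1.0 - sum(enter)  # consecutive errors
    return P

def generate_errors(P, n_bits, rng):
    """Return a list of n_bits values, 1 for an error and 0 otherwise."""
    M = len(P)
    state = M - 1                       # start in the error state (arbitrary choice)
    bits = []
    for _ in range(n_bits):
        state = rng.choices(range(M), weights=P[state])[0]
        bits.append(1 if state == M - 1 else 0)
    return bits

def gap_lengths_of(bits):
    """Lengths of the error-free runs between consecutive errors."""
    positions = [i for i, b in enumerate(bits) if b]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]

P = partitioned_chain(stay=[0.99, 0.70], enter=[0.5, 0.3])
bits = generate_errors(P, 10000, random.Random(0))
gap_lengths = gap_lengths_of(bits)
```

A histogram of `gap_lengths` gives an empirical error-free run distribution that can be compared with the one measured on the original error data.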

3.3.  ERROR  BURST 

In  Quarterly  Report  No.  4,  the  error  burst  is  defined  [7]  as  a 
sequence  of  bits  starting  and  ending  with  an  error  and  separated  from 
neighboring  bursts  by  at  least  K  error  free  bits,  where  K  is  a  parameter. 
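As a minimal sketch of this first definition, the function below splits an error sequence into bursts wherever at least K error-free bits separate two errors. The representation of the sequence as a list of 0 and 1 values and the function name are assumptions made for the example.

```python
def bursts_guard_space(bits, K):
    """Return (start, end) index pairs of bursts separated by >= K error-free bits."""
    positions = [i for i, b in enumerate(bits) if b]
    bursts = []
    if not positions:
        return bursts
    start = prev = positions[0]
    for pos in positions[1:]:
        if pos - prev - 1 >= K:         # guard space reached: close the current burst
            bursts.append((start, prev))
            start = pos
        prev = pos
    bursts.append((start, prev))
    return bursts

bits = [1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
found = bursts_guard_space(bits, K=3)   # [(0, 2), (6, 7), (12, 12)]
lengths = [end - start + 1 for start, end in found]
```

With K = 3 the example sequence yields bursts of length 3, 2, and 1; the single error at the end forms a burst by itself.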

A second definition [6] is instrumental in evaluating some error-correcting
codes. It defines the error burst with error density Ao as follows:

(1)  A  burst  begins  with  an  error  and  ends  with  an  error; 


(2)  The  ratio  of  the  number  of  errors  to  the  total  number  of 
bits  of  a  burst  is  larger  than  or  equal  to  the  specified 
density  Ao; 

(3)  If  successive  inclusion  of  the  next  error  keeps  the  error 
density  above  Ao,  the  burst  continues;  otherwise  the  burst 
ends; 

(4)  A  burst  cannot  begin  with  an  error  belonging  to  the  previous 
burst; 

(5)  A  single  error  is  defined  to  be  a  single-error  burst  with 
burst  length  of  one  digit. 
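Rules (1) to (5) can be sketched as a single pass over the error positions. This is an illustrative reading, not the program mentioned in the report: it assumes the sequence is a list of 0 and 1 values and uses "greater than or equal to" for the density test, following rule (2). The function name and the example data are hypothetical.

```python
def bursts_by_density(bits, density):
    """Return (start, end, n_errors) triples for bursts under the density definition."""
    positions = [i for i, b in enumerate(bits) if b]
    bursts = []
    k = 0
    while k < len(positions):
        start = end = positions[k]      # rules (1) and (4): begin at the next unused error
        count = 1                       # rule (5): a lone error is a burst of length one
        k += 1
        while k < len(positions):
            length = positions[k] - start + 1
            if (count + 1) / length >= density:   # rules (2) and (3)
                end = positions[k]
                count += 1
                k += 1
            else:
                break                   # the next error would dilute the burst: stop here
        bursts.append((start, end, count))
    return bursts

bits = [0] * 22
for i in (0, 1, 3, 10, 21):
    bits[i] = 1

found = bursts_by_density(bits, density=0.5)
# [(0, 3, 3), (10, 10, 1), (21, 21, 1)]
```

In the example the errors at positions 0, 1, and 3 form one burst of length 4 with density 0.75; including the error at position 10 would lower the density to 4/11, so the burst ends at position 3.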

A computer program has been written to compute the burst distribution
from the error sequence using the second definition. A preliminary
result on the burst distribution from the original error data is shown in
the following figure.

The two definitions do not cause an appreciable change in
the burst statistics. This can be seen by the following reasoning. The
first definition does not allow a long string of "0" in the burst, while
the second definition does. However, the probability of such bursts is
small because a long string of "0" inside the burst must be preceded by
dense errors.

The Viterbi decoding algorithm yields minimum probability of error when
applied to a memoryless channel provided that all input sequences are equally
likely. In this report, the algorithm was generalized for application to channels
with finite memory, and it was shown that the generalized algorithm also performs
maximum-likelihood decoding. It was also shown that the generalized Viterbi
algorithm on a simple memory channel performs better than the original Viterbi
algorithm with the same decoding complexity.

The M-state Markov model was reviewed in this report. The process of
identifying the parameters of the M-state model from the coefficients $A_i$ and
$A_i(n_j, n_{j+1})$ of the gap model was determined to be more complicated than
was anticipated. An alternative, the simple partitioned Markov model, was
examined to determine the effect of the second-order statistics, namely the
interdependence of the gaps, on the error burst distribution. An alternative
definition of the burst was adopted to speed up this investigation. The
difference or similarity between these two definitions will be determined.